22 research outputs found

RFaaS: RDMA-Enabled FaaS Platform for Serverless High-Performance Computing

Author: Calotoiu Alexandru
Copik Marcin
Hoefler Torsten
Taranov Konstantin
Publication date: 25/06/2021

The rigid MPI programming model and batch scheduling dominate high-performance computing. While clouds brought new levels of elasticity into the world of computing, supercomputers still suffer from low resource utilization rates. To enhance supercomputing clusters with the benefits of serverless computing, a modern cloud programming paradigm for pay-as-you-go execution of stateless functions, we present rFaaS, the first RDMA-aware Function-as-a-Service (FaaS) platform. With hot invocations and decentralized function placement, we overcome the major performance limitations of FaaS systems and provide low-latency remote invocations in multi-tenant environments. We evaluate the new serverless system through a series of microbenchmarks and show that remote functions execute with negligible performance overheads. We demonstrate how serverless computing can bring elastic resource management into MPI-based high-performance applications. Overall, our results show that MPI applications can benefit from modern cloud programming paradigms to guarantee high performance at lower resource costs

arXiv.org e-Print Archive

Data management in modern RDMA-capable networks

Author: Taranov Konstantin
Publication venue: ETH Zurich
Publication date: 01/01/2022

Current trends in modern hardware create many opportunities and challenges for data management systems. More specifically, novel high-performance network devices with the capability of high-throughput packet processing offer the potential of significant improvements in the performance of networked systems. However, advanced in-network acceleration and offload capabilities, including RDMA, often have strong constraints which force developers to limit the functionality of designed systems or even sacrifice their security. This dissertation aims to address challenges in the design and implementation of the data management protocols that efficiently use the offload capabilities offered by modern network accelerators in the context of various data storage and data analytics systems. Particularly, I address challenges in data management for key-value stores, shared message queues, and remote memory systems regarding storage reliability, performance, and memory fragmentation. Second, I propose a serialization-free communication library for Java virtual machines that allows applications to send on-heap objects through RDMA connections. I show how the library unlocks RDMA networking to Java virtual machines hiding all the burden of low-level RDMA programming from the users. Finally, I propose an extension to the InfiniBand architecture that enables authentication and encryption for RDMA networking to prevent information leakage and message tampering. I show how providers can implement RDMA secure channels with minimal changes to the existing InfiniBand protocol and with minor performance overheads. I conclude by discussing future research directions which arise from the work presented in this dissertation, and highlighting the potential of in-network processing for modern data management platforms

Repository for Publications and Research Data

CoRM: Compactable Remote Memory over RDMA

Author: Di Girolamo Salvatore
Hoefler Torsten
Taranov Konstantin
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/06/2021

Distributed memory systems are becoming increasingly important since they provide a system-scale abstraction where physically separated memories can be addressed as a single logical one. This abstraction enables memory disaggregation, allowing systems such as in-memory databases, caching services, and ephemeral storage to be naturally deployed at large scales. While this abstraction effectively increases the memory capacity of these systems, it faces additional overheads for remote memory accesses. To narrow the difference between local and remote accesses, low-latency RDMA networks are a key element for efficient memory disaggregation. However, RDMA acceleration poses new obstacles to efficient memory management and particularly to memory compaction: network controllers and CPUs can concurrently access memory, potentially leading to inconsistencies if memory management operations are not synchronized. To ensure consistency, most distributed memory systems do not provide memory compaction and are exposed to memory fragmentation. We introduce CoRM, an RDMA-accelerated shared memory system that supports memory compaction and ensures strict consistency while providing one-sided RDMA accesses. We show that CoRM sustains high read throughput during normal operations, comparable to similar systems not providing memory compaction, while experiencing minimal overheads during compaction. CoRM never disrupts RDMA connections and can reduce applications' active memory by up to 6x by performing memory compaction

Repository for Publications and Research Data

Naos: Serialization-free RDMA networking in Java

Author: Alonso Gustavo
Bruno Rodrigo
Hoefler Torsten
Taranov Konstantin
Publication venue: USENIX Association
Publication date: 01/07/2021

Managed languages such as Java and Scala do not allow developers to directly access heap objects. As a result, to send on-heap data over the network, it has to be explicitly converted to byte streams before sending and converted back to objects after receiving. The technique, also known as object serialization/deserialization, is an expensive procedure limiting the performance of JVM-based distributed systems as it induces additional memory copies and requires data transformation resulting in high CPU and memory bandwidth consumption. This paper presents Naos, a JVM-based technique bypassing heap serialization boundaries that allows objects to be directly sent from a local heap to a remote one with minimal CPU involvement and over RDMA networks. As Naos eliminates the need to copy and transform objects, and enables asynchronous communication, it offers significant speedups compared to state-of-the-art serialization libraries. Naos exposes a simple high level API hiding the complexity of the RDMA protocol that transparently allows JVM-based systems to take advantage of offloaded RDMA networking

Repository for Publications and Research Data

rFaaS: Enabling High Performance Serverless with RDMA and Leases

Author: Calotoiu Alexandru
Copik Marcin
Hoefler Torsten
Taranov Konstantin
Publication venue: IEEE
Publication date: 01/01/2023

High performance is needed in many computing systems, from batch-managed supercomputers to general-purpose cloud platforms. However, scientific clusters lack elastic parallelism, while clouds cannot offer competitive costs for high-performance applications. In this work, we investigate how modern cloud programming paradigms can bring the elasticity needed to allocate idle resources, decreasing computation costs and improving overall data center efficiency. Function-as-a-Service (FaaS) brings the pay-as-you-go execution of stateless functions, but its performance characteristics cannot match coarse-grained cloud and cluster allocations. To make serverless computing viable for high-performance and latency-sensitive applications, we present rFaaS, an RDMA-accelerated FaaS platform. We identify critical limitations of serverless - centralized scheduling and inefficient network transport - and improve the FaaS architecture with allocation leases and microsecond invocations. We show that our remote functions add only negligible overhead on top of the fastest available networks, and we decrease the execution latency by orders of magnitude compared to contemporary FaaS systems. Furthermore, we demonstrate the performance of rFaaS by evaluating real-world FaaS benchmarks and parallel applications. Overall, our results show that new allocation policies and remote memory access help FaaS applications achieve high performance and bring serverless computing to HPC

Repository for Publications and Research Data

KafkaDirect: Zero-copy Data Access for Apache Kafka over RDMA Networks

Author: Byan Steve
Hoefler Torsten
Taranov Konstantin
Publication venue: Association for Computing Machinery
Publication date: 01/06/2022

Apache Kafka is an open-source distributed publish-subscribe system, which is widely used in data centers for messaging between applications, log aggregation, and stream processing. The existing Kafka implementation uses TCP/IP for communication, which has various inefficiencies such as a high message dispatch cost due to OS involvement and excessive memory copies. Recently, the availability of cost-effective RDMA-capable network controllers within data centers and cloud infrastructures has encouraged many modern applications to adopt RDMA networking, which offers the potential to outperform classical TCP/IP. We introduce KafkaDirect, an extension to Apache Kafka that uses RDMA to accelerate the three most network-intensive datapaths: record production, record replication, and record consumption. In this work, we explore the design choices, including which RDMA operations to use to take full advantage of offloaded communication. Our RDMA design relies on one-sided RDMA requests to attain true zero-copy communication, completely avoiding the need for intermediate buffers in Kafka servers, thereby ensuring low-latency and high-throughput communication. KafkaDirect can offer up to a 9x increase in throughput for both Kafka producers and Kafka consumers, and can provide 4x and 50x reductions in latency for Kafka producers and Kafka consumers, respectively

Repository for Publications and Research Data

ZENODO

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

FaaSKeeper: a Blueprint for Serverless Services

Author: Calotoiu Alexandru
Copik Marcin
Hoefler Torsten
Taranov Konstantin
Publication date: 28/03/2022

FaaS (Function-as-a-Service) brought a fundamental shift into cloud computing: (persistent) virtual machines have been replaced with dynamically allocated resources, trading locality and statefulness for a pay-as-you-go model more suitable for varying and infrequent workloads. However, adapting services to function within the serverless paradigm while still fulfilling requirements is challenging. In this work, we introduce a design blueprint for creating complex serverless services and contribute a set of requirements for efficient and scalable FaaS computing. To showcase our approach, we focus on ZooKeeper, a centralized coordination service that offers a safe and wait-free consensus mechanism but requires a persistent allocation of computing resources that does not offer the flexibility needed to handle variable workloads. We design FaaSKeeper, the first coordination service built on serverless functions and cloud-native services. FaaSKeeper provides the same consistency guarantees and interface as ZooKeeper with a price model proportional to the activity in the system. In addition, we define synchronization primitives to extend the capabilities of scalable cloud storage services with consensus semantics needed for strong data consistency

arXiv.org e-Print Archive

High DNA melting activity of extremophyte Eutrema salsugineum cold shock domain proteins EsCSDP1 and EsCSDP3

Author: Aleksei Babakov
Konstantin Blagodatskikh
Konstantin Evlakov
Nikolai Zlobin
Vasiliy Taranov
Yakov Alekseev
Publication venue: Elsevier BV
Publication date: 01/03/2016

Plant cold shock domain proteins (CSDP) participate in maintenance of plant stress tolerance and in regulating their development. In the present paper we show that two out of three extremophyte plant Eutrema salsugineum proteins EsCSDP1-3, namely EsCSDP1 and EsCSDP3, possess high DNA-melting activity. DNA-melting activity of proteins was evaluated using a molecular beacon assay in two ways: by measuring the Tm parameter (the temperature at which half of the DNA beacon molecules are fully melted) and the beacon fluorescence at 4 °C. As the protein/beacon ratio was increased, a decrease in Tm was observed. Besides DNA-melting activity of full proteins, activity was measured for three isolated cold shock domains EsCSD1-3, the C-terminal domain of EsCSDP1 (EsZnF1), as well as a mixture of EsCSD1 and EsZnF1. The Tm reduction efficiency of proteins formed the following sequence: EsCSDP3≈EsCSDP1>(EsCSD1+EsZnF1)>EsZnF1>EsCSDP2. Only full proteins EsCSDP3 and EsCSDP1 demonstrated DNA-melting activity at 4 °C. The presented experimental data indicate that i: interaction of EsCSDP1-3 with the beacon single-stranded region is obligatory for efficient melting; ii: the cold shock domain and the C-terminal domain with zinc finger motifs should be present in one protein molecule to have high melting activity

Elsevier - Publisher Connector

Directory of Open Access Journals